Overview

Dataset statistics

Number of variables: 31
Number of observations: 38248
Missing cells: 144663
Missing cells (%): 12.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 9.0 MiB
Average record size in memory: 248.0 B
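
The missing-cell percentage can be checked from the counts in this block: the table holds variables × observations cells, and the share is the missing count divided by that total. A minimal sketch using only the figures reported here (the function name is illustrative, not part of the report):

```python
def missing_cell_share(n_missing: int, n_variables: int, n_observations: int) -> float:
    """Return the percentage of missing cells in an n_observations x n_variables table."""
    total_cells = n_variables * n_observations
    return 100.0 * n_missing / total_cells


# Figures copied from the dataset statistics.
share = missing_cell_share(n_missing=144663, n_variables=31, n_observations=38248)
print(f"Missing cells: {share:.1f}%")
```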

Variable types

Numeric: 15
Categorical: 16

Alerts

dat_cadastramento_fam has a high cardinality: 3487 distinct values (High cardinality)
dat_alteracao_fam has a high cardinality: 962 distinct values (High cardinality)
dat_atualizacao_familia has a high cardinality: 1088 distinct values (High cardinality)
nom_estab_assist_saude_fam has a high cardinality: 1345 distinct values (High cardinality)
nom_centro_assist_fam has a high cardinality: 222 distinct values (High cardinality)
cd_ibge is highly correlated with classf and 2 other fields (High correlation)
estrato is highly correlated with classf (High correlation)
classf is highly correlated with cd_ibge and 2 other fields (High correlation)
id_familia is highly correlated with classf (High correlation)
vlr_renda_media_fam is highly correlated with marc_pbf (High correlation)
cod_local_domic_fam is highly correlated with cod_destino_lixo_domic_fam (High correlation)
qtd_comodos_domic_fam is highly correlated with qtd_comodos_dormitorio_fam (High correlation)
qtd_comodos_dormitorio_fam is highly correlated with qtd_comodos_domic_fam (High correlation)
cod_material_domic_fam is highly correlated with cd_ibge (High correlation)
cod_agua_canalizada_fam is highly correlated with cod_abaste_agua_domic_fam (High correlation)
cod_abaste_agua_domic_fam is highly correlated with cod_agua_canalizada_fam (High correlation)
cod_destino_lixo_domic_fam is highly correlated with cod_local_domic_fam (High correlation)
cod_centro_assist_fam is highly correlated with cd_ibge (High correlation)
marc_pbf is highly correlated with vlr_renda_media_fam (High correlation)
cd_ibge is highly correlated with classf and 1 other field (High correlation)
estrato is highly correlated with classf (High correlation)
classf is highly correlated with cd_ibge and 2 other fields (High correlation)
id_familia is highly correlated with classf (High correlation)
vlr_renda_media_fam is highly correlated with marc_pbf (High correlation)
cod_local_domic_fam is highly correlated with cod_destino_lixo_domic_fam (High correlation)
qtd_comodos_domic_fam is highly correlated with qtd_comodos_dormitorio_fam (High correlation)
qtd_comodos_dormitorio_fam is highly correlated with qtd_comodos_domic_fam (High correlation)
cod_agua_canalizada_fam is highly correlated with cod_abaste_agua_domic_fam (High correlation)
cod_abaste_agua_domic_fam is highly correlated with cod_agua_canalizada_fam (High correlation)
cod_destino_lixo_domic_fam is highly correlated with cod_local_domic_fam (High correlation)
cod_centro_assist_fam is highly correlated with cd_ibge and 1 other field (High correlation)
marc_pbf is highly correlated with vlr_renda_media_fam (High correlation)
peso.fam is highly correlated with cod_centro_assist_fam (High correlation)
cd_ibge is highly correlated with classf and 1 other field (High correlation)
estrato is highly correlated with classf (High correlation)
classf is highly correlated with cd_ibge and 2 other fields (High correlation)
id_familia is highly correlated with classf (High correlation)
vlr_renda_media_fam is highly correlated with marc_pbf (High correlation)
cod_local_domic_fam is highly correlated with cod_destino_lixo_domic_fam (High correlation)
qtd_comodos_domic_fam is highly correlated with qtd_comodos_dormitorio_fam (High correlation)
qtd_comodos_dormitorio_fam is highly correlated with qtd_comodos_domic_fam (High correlation)
cod_agua_canalizada_fam is highly correlated with cod_abaste_agua_domic_fam (High correlation)
cod_abaste_agua_domic_fam is highly correlated with cod_agua_canalizada_fam (High correlation)
cod_destino_lixo_domic_fam is highly correlated with cod_local_domic_fam (High correlation)
cod_centro_assist_fam is highly correlated with cd_ibge (High correlation)
marc_pbf is highly correlated with vlr_renda_media_fam (High correlation)
classf is highly correlated with estrato (High correlation)
estrato is highly correlated with classf (High correlation)
cod_familia_indigena_fam is highly correlated with ind_familia_quilombola_fam (High correlation)
cod_abaste_agua_domic_fam is highly correlated with cod_agua_canalizada_fam and 2 other fields (High correlation)
cod_agua_canalizada_fam is highly correlated with cod_abaste_agua_domic_fam and 1 other field (High correlation)
cod_calcamento_domic_fam is highly correlated with cod_especie_domic_fam (High correlation)
cod_local_domic_fam is highly correlated with cod_abaste_agua_domic_fam (High correlation)
cod_banheiro_domic_fam is highly correlated with cod_especie_domic_fam (High correlation)
ind_familia_quilombola_fam is highly correlated with cod_familia_indigena_fam (High correlation)
cod_especie_domic_fam is highly correlated with cod_abaste_agua_domic_fam and 3 other fields (High correlation)
cd_ibge is highly correlated with classf and 4 other fields (High correlation)
estrato is highly correlated with classf and 2 other fields (High correlation)
classf is highly correlated with cd_ibge and 4 other fields (High correlation)
id_familia is highly correlated with cd_ibge and 4 other fields (High correlation)
vlr_renda_media_fam is highly correlated with marc_pbf (High correlation)
cod_local_domic_fam is highly correlated with cod_agua_canalizada_fam and 2 other fields (High correlation)
qtd_comodos_domic_fam is highly correlated with qtd_comodos_dormitorio_fam (High correlation)
qtd_comodos_dormitorio_fam is highly correlated with qtd_comodos_domic_fam (High correlation)
cod_material_piso_fam is highly correlated with cd_ibge and 2 other fields (High correlation)
cod_material_domic_fam is highly correlated with classf and 2 other fields (High correlation)
cod_agua_canalizada_fam is highly correlated with cod_local_domic_fam and 3 other fields (High correlation)
cod_abaste_agua_domic_fam is highly correlated with cod_local_domic_fam and 1 other field (High correlation)
cod_banheiro_domic_fam is highly correlated with cod_material_domic_fam and 2 other fields (High correlation)
cod_destino_lixo_domic_fam is highly correlated with cod_local_domic_fam and 3 other fields (High correlation)
cod_calcamento_domic_fam is highly correlated with cod_destino_lixo_domic_fam (High correlation)
cod_centro_assist_fam is highly correlated with cd_ibge and 5 other fields (High correlation)
marc_pbf is highly correlated with vlr_renda_media_fam (High correlation)
peso.fam is highly correlated with cd_ibge and 2 other fields (High correlation)
qtd_comodos_domic_fam has 3279 (8.6%) missing values (Missing)
qtd_comodos_dormitorio_fam has 3277 (8.6%) missing values (Missing)
cod_material_piso_fam has 3270 (8.5%) missing values (Missing)
cod_material_domic_fam has 3270 (8.5%) missing values (Missing)
cod_agua_canalizada_fam has 3270 (8.5%) missing values (Missing)
cod_abaste_agua_domic_fam has 3270 (8.5%) missing values (Missing)
cod_banheiro_domic_fam has 3270 (8.5%) missing values (Missing)
cod_escoa_sanitario_domic_fam has 7169 (18.7%) missing values (Missing)
cod_destino_lixo_domic_fam has 3270 (8.5%) missing values (Missing)
cod_iluminacao_domic_fam has 3270 (8.5%) missing values (Missing)
cod_calcamento_domic_fam has 3270 (8.5%) missing values (Missing)
ind_familia_quilombola_fam has 1993 (5.2%) missing values (Missing)
nom_estab_assist_saude_fam has 22940 (60.0%) missing values (Missing)
cod_eas_fam has 22940 (60.0%) missing values (Missing)
nom_centro_assist_fam has 27904 (73.0%) missing values (Missing)
cod_centro_assist_fam has 27904 (73.0%) missing values (Missing)
ind_parc_mds_fam has 1009 (2.6%) missing values (Missing)
id_familia has unique values (Unique)
vlr_renda_media_fam has 4939 (12.9%) zeros (Zeros)
ind_parc_mds_fam has 33420 (87.4%) zeros (Zeros)
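
The Missing alerts pair a raw count with its share of the 38248 observations. The sketch below recomputes those shares for a few columns and flags the ones above a cut-off. The cut-off value and the function name are assumptions made for illustration; the report does not state which threshold triggered its alerts.

```python
N_OBSERVATIONS = 38248  # number of observations reported in the overview

# Missing counts copied from the alerts above, plus one column that was not
# flagged (cod_local_domic_fam, 40 missing, from its own variable section).
missing_counts = {
    "cod_escoa_sanitario_domic_fam": 7169,
    "nom_centro_assist_fam": 27904,
    "ind_parc_mds_fam": 1009,
    "cod_local_domic_fam": 40,
}


def missing_alerts(counts, n_rows, threshold_pct=1.0):
    """Return {column: percent missing} for columns above threshold_pct.

    threshold_pct is a hypothetical cut-off chosen for this example.
    """
    alerts = {}
    for column, n_missing in counts.items():
        pct = 100.0 * n_missing / n_rows
        if pct > threshold_pct:
            alerts[column] = round(pct, 1)
    return alerts


alerts = missing_alerts(missing_counts, N_OBSERVATIONS)
for column, pct in alerts.items():
    print(f"{column} has {missing_counts[column]} ({pct}%) missing values")
```

With this cut-off, cod_local_domic_fam (about 0.1% missing) is left out, which is consistent with it not appearing among the alerts.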

Reproduction

Analysis started: 2023-03-24 16:02:54.422080
Analysis finished: 2023-03-24 16:03:51.063731
Duration: 56.64 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
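
The reported duration is simply the difference between the two timestamps. A small check with the standard library, using the values exactly as printed above:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-03-24 16:02:54.422080")
finished = datetime.fromisoformat("2023-03-24 16:03:51.063731")

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```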

Variables

cd_ibge
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 210
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2211792.444
Minimum: 1200013
Maximum: 5108402
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 298.9 KiB

Quantile statistics

Minimum: 1200013
5-th percentile: 1200708
Q1: 1400456
median: 2111201
Q3: 2603454
95-th percentile: 5108402
Maximum: 5108402
Range: 3908389
Interquartile range (IQR): 1202998

Descriptive statistics

Standard deviation: 1022750.504
Coefficient of variation (CV): 0.4624079929
Kurtosis: 2.655533582
Mean: 2211792.444
Median Absolute Deviation (MAD): 591402
Skewness: 1.737035241
Sum: 8.459663739 × 10^10
Variance: 1.046018594 × 10^12
Monotonicity: Not monotonic
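
Several of these figures are derived from others in the same block: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the range and IQR are differences of quantiles. The following sketch recomputes them from the printed values; it assumes the usual textbook definitions rather than quoting the profiler's source.

```python
import math

# Values copied from the cd_ibge statistics.
n_observations = 38248
mean = 2211792.444
std = 1022750.504
minimum, maximum = 1200013, 5108402
q1, q3 = 1400456, 2603454

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
total = mean * n_observations    # sum (no missing values in this column)

print(f"CV       = {cv:.10f}")
print(f"Variance = {variance:.9e}")
print(f"Range    = {value_range}")
print(f"IQR      = {iqr}")
print(f"Sum      = {total:.9e}")
```

Small differences in the last digits are expected because the inputs are already rounded.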
Histogram with fixed size bins (bins=50)
Common values

Value               Count  Frequency (%)
5108402             1926   5.0%
1600279             933    2.4%
2700300             908    2.4%
2105302             875    2.3%
1400209             833    2.2%
1400308             761    2.0%
1400472             756    2.0%
2111201             673    1.8%
2211100             666    1.7%
1600501             628    1.6%
Other values (200)  29289  76.6%

Smallest values

Value    Count  Frequency (%)
1200013  80     0.2%
1200054  48     0.1%
1200104  84     0.2%
1200138  47     0.1%
1200179  57     0.1%
1200203  305    0.8%
1200252  72     0.2%
1200302  109    0.3%
1200328  35     0.1%
1200336  121    0.3%

Largest values

Value    Count  Frequency (%)
5108402  1926   5.0%
5107701  194    0.5%
5106505  277    0.7%
5106455  49     0.1%
5106208  66     0.2%
5106109  138    0.4%
5105903  162    0.4%
5104906  83     0.2%
5103007  171    0.4%
5101605  99     0.3%
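
Each frequency is the count divided by the 38248 observations, and the "Other values" row is whatever remains after the ten most common codes. A short sketch that rebuilds that row from the counts listed for cd_ibge (variable names are illustrative):

```python
n_observations = 38248

# Ten most common cd_ibge values and their counts, as listed in this section.
top_counts = {
    5108402: 1926,
    1600279: 933,
    2700300: 908,
    2105302: 875,
    1400209: 833,
    1400308: 761,
    1400472: 756,
    2111201: 673,
    2211100: 666,
    1600501: 628,
}

# Rows not covered by the top ten fall into "Other values".
other_count = n_observations - sum(top_counts.values())
other_pct = 100.0 * other_count / n_observations

for value, count in top_counts.items():
    print(f"{value}  {count:>5}  {100.0 * count / n_observations:.1f}%")
print(f"Other values  {other_count}  {other_pct:.1f}%")
```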

estrato
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 298.9 KiB

Value  Count
2      22454
1      15794

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 38248
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value  Count  Frequency (%)
2      22454  58.7%
1      15794  41.3%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
2      22454  58.7%
1      15794  41.3%

Most occurring characters

Value  Count  Frequency (%)
2      22454  58.7%
1      15794  41.3%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  38248  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
2      22454  58.7%
1      15794  41.3%

Most occurring scripts

Value   Count  Frequency (%)
Common  38248  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
2      22454  58.7%
1      15794  41.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  38248  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
2      22454  58.7%
1      15794  41.3%

classf
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size298.9 KiB
2
25033 
3
13215 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters38248
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2
2nd row2
3rd row2
4th row2
5th row2

Common Values

ValueCountFrequency (%)
225033
65.4%
313215
34.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
225033
65.4%
313215
34.6%

Most occurring characters

ValueCountFrequency (%)
225033
65.4%
313215
34.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number38248
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
225033
65.4%
313215
34.6%

Most occurring scripts

ValueCountFrequency (%)
Common38248
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
225033
65.4%
313215
34.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII38248
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
225033
65.4%
313215
34.6%

id_familia
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct38248
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean23447.90928
Minimum1
Maximum45777
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size298.9 KiB

Quantile statistics

Minimum1
5-th percentile2452.35
Q112145.75
median23751
Q335154.25
95-th percentile43488.65
Maximum45777
Range45776
Interquartile range (IQR)23008.5

Descriptive statistics

Standard deviation13213.84825
Coefficient of variation (CV)0.5635405741
Kurtosis-1.20130851
Mean23447.90928
Median Absolute Deviation (MAD)11499
Skewness-0.05099880456
Sum896835634
Variance174605785.7
MonotonicityStrictly increasing
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
11
 
< 0.1%
310881
 
< 0.1%
310901
 
< 0.1%
310911
 
< 0.1%
310921
 
< 0.1%
310931
 
< 0.1%
310941
 
< 0.1%
310951
 
< 0.1%
310971
 
< 0.1%
310981
 
< 0.1%
Other values (38238)38238
> 99.9%
ValueCountFrequency (%)
11
< 0.1%
31
< 0.1%
41
< 0.1%
61
< 0.1%
71
< 0.1%
81
< 0.1%
91
< 0.1%
101
< 0.1%
111
< 0.1%
121
< 0.1%
ValueCountFrequency (%)
457771
< 0.1%
457761
< 0.1%
457751
< 0.1%
457731
< 0.1%
457721
< 0.1%
457711
< 0.1%
457701
< 0.1%
457681
< 0.1%
457671
< 0.1%
457661
< 0.1%

dat_cadastramento_fam
Categorical

HIGH CARDINALITY

Distinct3487
Distinct (%)9.1%
Missing0
Missing (%)0.0%
Memory size298.9 KiB
2003-03-13
 
1270
2006-04-08
 
119
2006-04-14
 
110
2006-04-01
 
104
2002-08-18
 
97
Other values (3482)
36548 

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters382480
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
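
The category names used in these character tables ("Decimal Number", "Dash Punctuation") correspond to Unicode general categories, which Python exposes through the standard `unicodedata` module as short codes (`Nd`, `Pd`). The sketch below counts categories over the five sample dates listed for this variable; it illustrates the idea and is not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# The five sample values listed for dat_cadastramento_fam.
sample = ["2018-06-28", "2018-08-27", "2018-02-23", "2013-12-27", "2018-03-26"]

# unicodedata.category returns the two-letter general category of a character:
# "Nd" is Decimal Number, "Pd" is Dash Punctuation.
category_counts = Counter(
    unicodedata.category(ch) for value in sample for ch in value
)

total_chars = sum(category_counts.values())
for category, count in category_counts.most_common():
    print(f"{category}: {count} ({100.0 * count / total_chars:.1f}%)")
```

Every YYYY-MM-DD string has eight digits and two hyphens, which is why the report shows an 80.0% / 20.0% split for this column.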

Unique

Unique261 ?
Unique (%)0.7%

Sample

1st row2018-06-28
2nd row2018-08-27
3rd row2018-02-23
4th row2013-12-27
5th row2018-03-26
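
The values look like ISO 8601 dates stored as text, which would explain why the column is typed as Categorical and flagged for high cardinality. If that reading is right, parsing them before profiling makes date arithmetic possible. A minimal standard-library sketch on the sample rows (in pandas, `pd.to_datetime` would be the usual counterpart):

```python
from datetime import date

# Sample rows listed above for dat_cadastramento_fam.
sample = ["2018-06-28", "2018-08-27", "2018-02-23", "2013-12-27", "2018-03-26"]

# date.fromisoformat parses YYYY-MM-DD strings into date objects.
parsed = [date.fromisoformat(text) for text in sample]

earliest, latest = min(parsed), max(parsed)
span_days = (latest - earliest).days
print(f"earliest={earliest}, latest={latest}, span={span_days} days")
```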

Common Values

ValueCountFrequency (%)
2003-03-131270
 
3.3%
2006-04-08119
 
0.3%
2006-04-14110
 
0.3%
2006-04-01104
 
0.3%
2002-08-1897
 
0.3%
2002-07-2192
 
0.2%
2006-08-1990
 
0.2%
2002-07-0190
 
0.2%
2002-09-1584
 
0.2%
2002-07-2383
 
0.2%
Other values (3477)36109
94.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
2003-03-131270
 
3.3%
2006-04-08119
 
0.3%
2006-04-14110
 
0.3%
2006-04-01104
 
0.3%
2002-08-1897
 
0.3%
2002-07-2192
 
0.2%
2006-08-1990
 
0.2%
2002-07-0190
 
0.2%
2002-09-1584
 
0.2%
2002-07-2383
 
0.2%
Other values (3477)36109
94.4%

Most occurring characters

ValueCountFrequency (%)
0100163
26.2%
-76496
20.0%
264283
16.8%
158521
15.3%
814947
 
3.9%
314868
 
3.9%
713440
 
3.5%
611991
 
3.1%
49837
 
2.6%
59505
 
2.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number305984
80.0%
Dash Punctuation76496
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0100163
32.7%
264283
21.0%
158521
19.1%
814947
 
4.9%
314868
 
4.9%
713440
 
4.4%
611991
 
3.9%
49837
 
3.2%
59505
 
3.1%
98429
 
2.8%
Dash Punctuation
ValueCountFrequency (%)
-76496
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common382480
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0100163
26.2%
-76496
20.0%
264283
16.8%
158521
15.3%
814947
 
3.9%
314868
 
3.9%
713440
 
3.5%
611991
 
3.1%
49837
 
2.6%
59505
 
2.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII382480
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0100163
26.2%
-76496
20.0%
264283
16.8%
158521
15.3%
814947
 
3.9%
314868
 
3.9%
713440
 
3.5%
611991
 
3.1%
49837
 
2.6%
59505
 
2.5%

dat_alteracao_fam
Categorical

HIGH CARDINALITY

Distinct962
Distinct (%)2.5%
Missing0
Missing (%)0.0%
Memory size298.9 KiB
2018-09-30
13535 
2018-10-01
11391 
2018-10-02
1684 
2018-09-25
 
178
2018-11-28
 
159
Other values (957)
11301 

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters382480
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique138 ?
Unique (%)0.4%

Sample

1st row2018-10-02
2nd row2018-11-29
3rd row2018-02-27
4th row2018-10-01
5th row2018-03-28

Common Values

ValueCountFrequency (%)
2018-09-3013535
35.4%
2018-10-0111391
29.8%
2018-10-021684
 
4.4%
2018-09-25178
 
0.5%
2018-11-28159
 
0.4%
2018-11-13149
 
0.4%
2018-11-27148
 
0.4%
2018-09-27147
 
0.4%
2018-10-30147
 
0.4%
2018-12-04145
 
0.4%
Other values (952)10565
27.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
2018-09-3013535
35.4%
2018-10-0111391
29.8%
2018-10-021684
 
4.4%
2018-09-25178
 
0.5%
2018-11-28159
 
0.4%
2018-11-13149
 
0.4%
2018-11-27148
 
0.4%
2018-09-27147
 
0.4%
2018-10-30147
 
0.4%
2018-12-04145
 
0.4%
Other values (952)10565
27.6%

Most occurring characters

ValueCountFrequency (%)
0102818
26.9%
176845
20.1%
-76496
20.0%
247622
12.5%
836277
 
9.5%
316037
 
4.2%
915985
 
4.2%
73429
 
0.9%
62956
 
0.8%
52369
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number305984
80.0%
Dash Punctuation76496
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0102818
33.6%
176845
25.1%
247622
15.6%
836277
 
11.9%
316037
 
5.2%
915985
 
5.2%
73429
 
1.1%
62956
 
1.0%
52369
 
0.8%
41646
 
0.5%
Dash Punctuation
ValueCountFrequency (%)
-76496
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common382480
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0102818
26.9%
176845
20.1%
-76496
20.0%
247622
12.5%
836277
 
9.5%
316037
 
4.2%
915985
 
4.2%
73429
 
0.9%
62956
 
0.8%
52369
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII382480
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0102818
26.9%
176845
20.1%
-76496
20.0%
247622
12.5%
836277
 
9.5%
316037
 
4.2%
915985
 
4.2%
73429
 
0.9%
62956
 
0.8%
52369
 
0.6%

vlr_renda_media_fam
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct954
Distinct (%)2.5%
Missing5
Missing (%)< 0.1%
Infinite0
Infinite (%)0.0%
Mean209.0889836
Minimum0
Maximum2811
Zeros4939
Zeros (%)12.9%
Negative0
Negative (%)0.0%
Memory size298.9 KiB

Quantile statistics

Minimum0
5-th percentile0
Q116
median62
Q3240
95-th percentile954
Maximum2811
Range2811
Interquartile range (IQR)224

Descriptive statistics

Standard deviation317.8427488
Coefficient of variation (CV)1.520131493
Kurtosis4.719865218
Mean209.0889836
Median Absolute Deviation (MAD)58
Skewness2.084783446
Sum7996190
Variance101024.013
MonotonicityNot monotonic
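
The mean reported for vlr_renda_media_fam is consistent with dividing the sum by the number of non-missing rows (38248 observations minus 5 missing), while the zeros percentage uses all observations as the denominator. The sketch below checks both from the printed figures; the choice of denominators is inferred from the numbers, not taken from the profiler's documentation.

```python
import math

# Figures copied from the vlr_renda_media_fam section.
n_observations = 38248
n_missing = 5
n_zeros = 4939
total_sum = 7996190
std = 317.8427488

n_valid = n_observations - n_missing         # rows with a value
mean = total_sum / n_valid                   # mean over non-missing rows
cv = std / mean                              # coefficient of variation
zeros_pct = 100.0 * n_zeros / n_observations

print(f"mean  = {mean:.7f}")
print(f"CV    = {cv:.6f}")
print(f"zeros = {zeros_pct:.1f}%")
```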
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
04939
 
12.9%
502094
 
5.5%
9541466
 
3.8%
9371424
 
3.7%
25961
 
2.5%
75938
 
2.5%
100936
 
2.4%
33815
 
2.1%
66809
 
2.1%
16699
 
1.8%
Other values (944)23162
60.6%
ValueCountFrequency (%)
04939
12.9%
1140
 
0.4%
2313
 
0.8%
3208
 
0.5%
4426
 
1.1%
5371
 
1.0%
6401
 
1.0%
7153
 
0.4%
8537
 
1.4%
975
 
0.2%
ValueCountFrequency (%)
28111
 
< 0.1%
28003
< 0.1%
27271
 
< 0.1%
26331
 
< 0.1%
26001
 
< 0.1%
25501
 
< 0.1%
25005
< 0.1%
24002
 
< 0.1%
23791
 
< 0.1%
23111
 
< 0.1%

dat_atualizacao_familia
Categorical

HIGH CARDINALITY

Distinct1088
Distinct (%)2.8%
Missing0
Missing (%)0.0%
Memory size298.9 KiB
2018-09-11
 
147
2018-09-04
 
146
2018-09-12
 
146
2018-11-28
 
146
2018-11-13
 
143
Other values (1083)
37520 

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters382480
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique65 ?
Unique (%)0.2%

Sample

1st row2018-06-28
2nd row2018-11-29
3rd row2018-02-23
4th row2017-06-22
5th row2018-03-26

Common Values

ValueCountFrequency (%)
2018-09-11147
 
0.4%
2018-09-04146
 
0.4%
2018-09-12146
 
0.4%
2018-11-28146
 
0.4%
2018-11-13143
 
0.4%
2018-06-05140
 
0.4%
2018-09-14137
 
0.4%
2018-10-30135
 
0.4%
2018-12-04135
 
0.4%
2018-09-10134
 
0.4%
Other values (1078)36839
96.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
2018-09-11147
 
0.4%
2018-11-28146
 
0.4%
2018-09-04146
 
0.4%
2018-09-12146
 
0.4%
2018-11-13143
 
0.4%
2018-06-05140
 
0.4%
2018-09-14137
 
0.4%
2018-10-30135
 
0.4%
2018-12-04135
 
0.4%
2018-09-10134
 
0.4%
Other values (1078)36839
96.3%

Most occurring characters

ValueCountFrequency (%)
084093
22.0%
-76496
20.0%
171987
18.8%
259532
15.6%
829076
 
7.6%
718489
 
4.8%
611424
 
3.0%
38890
 
2.3%
58104
 
2.1%
97462
 
2.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number305984
80.0%
Dash Punctuation76496
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
084093
27.5%
171987
23.5%
259532
19.5%
829076
 
9.5%
718489
 
6.0%
611424
 
3.7%
38890
 
2.9%
58104
 
2.6%
97462
 
2.4%
46927
 
2.3%
Dash Punctuation
ValueCountFrequency (%)
-76496
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common382480
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
084093
22.0%
-76496
20.0%
171987
18.8%
259532
15.6%
829076
 
7.6%
718489
 
4.8%
611424
 
3.0%
38890
 
2.3%
58104
 
2.1%
97462
 
2.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII382480
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
084093
22.0%
-76496
20.0%
171987
18.8%
259532
15.6%
829076
 
7.6%
718489
 
4.8%
611424
 
3.0%
38890
 
2.3%
58104
 
2.1%
97462
 
2.0%

cod_local_domic_fam
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing40
Missing (%)0.1%
Memory size298.9 KiB
1.0
25891 
2.0
12317 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters114624
Distinct characters4
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row1.0
3rd row1.0
4th row1.0
5th row1.0
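
The labels appear as 1.0 and 2.0 rather than 1 and 2. A plausible cause, stated here as an assumption, is that the column holds integer codes plus missing values, and pandas stores such a column as floating point by default. The hypothetical helper below shows one way to normalise those labels before reporting:

```python
def normalize_code(label):
    """Turn a float-formatted category label such as '1.0' into '1'.

    Missing labels (None or empty string) are returned as None; labels that
    are not whole numbers are left unchanged.
    """
    if label is None or label == "":
        return None
    number = float(label)
    if number.is_integer():
        return str(int(number))
    return label


# Sample rows from this section plus one missing entry.
raw = ["1.0", "1.0", "2.0", None]
print([normalize_code(value) for value in raw])
```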

Common Values

ValueCountFrequency (%)
1.025891
67.7%
2.012317
32.2%
(Missing)40
 
0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
1.025891
67.8%
2.012317
32.2%

Most occurring characters

ValueCountFrequency (%)
.38208
33.3%
038208
33.3%
125891
22.6%
212317
 
10.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number76416
66.7%
Other Punctuation38208
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
038208
50.0%
125891
33.9%
212317
 
16.1%
Other Punctuation
ValueCountFrequency (%)
.38208
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common114624
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
.38208
33.3%
038208
33.3%
125891
22.6%
212317
 
10.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII114624
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.38208
33.3%
038208
33.3%
125891
22.6%
212317
 
10.7%

cod_especie_domic_fam
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing40
Missing (%)0.1%
Memory size298.9 KiB
1.0
34978 
2.0
 
2890
3.0
 
340

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters114624
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row1.0
3rd row1.0
4th row1.0
5th row1.0

Common Values

ValueCountFrequency (%)
1.034978
91.5%
2.02890
 
7.6%
3.0340
 
0.9%
(Missing)40
 
0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
1.034978
91.5%
2.02890
 
7.6%
3.0340
 
0.9%

Most occurring characters

ValueCountFrequency (%)
.38208
33.3%
038208
33.3%
134978
30.5%
22890
 
2.5%
3340
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number76416
66.7%
Other Punctuation38208
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
038208
50.0%
134978
45.8%
22890
 
3.8%
3340
 
0.4%
Other Punctuation
ValueCountFrequency (%)
.38208
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common114624
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
.38208
33.3%
038208
33.3%
134978
30.5%
22890
 
2.5%
3340
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII114624
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.38208
33.3%
038208
33.3%
134978
30.5%
22890
 
2.5%
3340
 
0.3%

qtd_comodos_domic_fam
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct15
Distinct (%)< 0.1%
Missing3279
Missing (%)8.6%
Infinite0
Infinite (%)0.0%
Mean4.077611599
Minimum0
Maximum15
Zeros3
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size298.9 KiB

Quantile statistics

Minimum0
5-th percentile1
Q13
median4
Q35
95-th percentile6
Maximum15
Range15
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.452769763
Coefficient of variation (CV)0.3562795836
Kurtosis0.5177686452
Mean4.077611599
Median Absolute Deviation (MAD)1
Skewness-0.06023559091
Sum142590
Variance2.110539983
MonotonicityNot monotonic
Histogram with fixed size bins (bins=15)
ValueCountFrequency (%)
510704
28.0%
49217
24.1%
35661
14.8%
23275
 
8.6%
63020
 
7.9%
11942
 
5.1%
7744
 
1.9%
8293
 
0.8%
971
 
0.2%
1024
 
0.1%
Other values (5)18
 
< 0.1%
(Missing)3279
 
8.6%
ValueCountFrequency (%)
03
 
< 0.1%
11942
 
5.1%
23275
 
8.6%
35661
14.8%
49217
24.1%
510704
28.0%
63020
 
7.9%
7744
 
1.9%
8293
 
0.8%
971
 
0.2%
ValueCountFrequency (%)
151
 
< 0.1%
141
 
< 0.1%
124
 
< 0.1%
119
 
< 0.1%
1024
 
0.1%
971
 
0.2%
8293
 
0.8%
7744
 
1.9%
63020
 
7.9%
510704
28.0%

qtd_comodos_dormitorio_fam
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct10
Distinct (%)< 0.1%
Missing3277
Missing (%)8.6%
Infinite0
Infinite (%)0.0%
Mean1.77790169
Minimum0
Maximum12
Zeros12
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size298.9 KiB

Quantile statistics

Minimum0
5-th percentile1
Q11
median2
Q32
95-th percentile3
Maximum12
Range12
Interquartile range (IQR)1

Descriptive statistics

Standard deviation0.736560308
Coefficient of variation (CV)0.4142862973
Kurtosis3.75611111
Mean1.77790169
Median Absolute Deviation (MAD)1
Skewness1.036631206
Sum62175
Variance0.5425210874
MonotonicityNot monotonic
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%)
217265
45.1%
113124
34.3%
33922
 
10.3%
4535
 
1.4%
580
 
0.2%
627
 
0.1%
012
 
< 0.1%
73
 
< 0.1%
122
 
< 0.1%
81
 
< 0.1%
(Missing)3277
 
8.6%
ValueCountFrequency (%)
012
 
< 0.1%
113124
34.3%
217265
45.1%
33922
 
10.3%
4535
 
1.4%
580
 
0.2%
627
 
0.1%
73
 
< 0.1%
81
 
< 0.1%
122
 
< 0.1%
ValueCountFrequency (%)
122
 
< 0.1%
81
 
< 0.1%
73
 
< 0.1%
627
 
0.1%
580
 
0.2%
4535
 
1.4%
33922
 
10.3%
217265
45.1%
113124
34.3%
012
 
< 0.1%

cod_material_piso_fam
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct7
Distinct (%)< 0.1%
Missing3270
Missing (%)8.5%
Infinite0
Infinite (%)0.0%
Mean2.965149523
Minimum1
Maximum7
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size298.9 KiB

Quantile statistics

Minimum1
5-th percentile1
Q12
median2
Q35
95-th percentile5
Maximum7
Range6
Interquartile range (IQR)3

Descriptive statistics

Standard deviation1.437935757
Coefficient of variation (CV)0.4849454457
Kurtosis-1.266285386
Mean2.965149523
Median Absolute Deviation (MAD)0
Skewness0.5176421145
Sum103715
Variance2.067659241
MonotonicityNot monotonic
Histogram with fixed size bins (bins=7)
ValueCountFrequency (%)
218175
47.5%
59268
24.2%
43454
 
9.0%
12820
 
7.4%
31106
 
2.9%
7141
 
0.4%
614
 
< 0.1%
(Missing)3270
 
8.5%
ValueCountFrequency (%)
12820
 
7.4%
218175
47.5%
31106
 
2.9%
43454
 
9.0%
59268
24.2%
614
 
< 0.1%
7141
 
0.4%
ValueCountFrequency (%)
7141
 
0.4%
614
 
< 0.1%
59268
24.2%
43454
 
9.0%
31106
 
2.9%
218175
47.5%
12820
 
7.4%

cod_material_domic_fam
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct8
Distinct (%)< 0.1%
Missing3270
Missing (%)8.5%
Infinite0
Infinite (%)0.0%
Mean2.053891017
Minimum1
Maximum8
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size298.9 KiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median1
Q33
95-th percentile6
Maximum8
Range7
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.501592254
Coefficient of variation (CV)0.7310963634
Kurtosis3.061072785
Mean2.053891017
Median Absolute Deviation (MAD)0
Skewness1.764081546
Sum71841
Variance2.254779296
MonotonicityNot monotonic
Histogram with fixed size bins (bins=8)
ValueCountFrequency (%)
118462
48.3%
36692
 
17.5%
25974
 
15.6%
51377
 
3.6%
61163
 
3.0%
4710
 
1.9%
8452
 
1.2%
7148
 
0.4%
(Missing)3270
 
8.5%
ValueCountFrequency (%)
118462
48.3%
25974
 
15.6%
36692
 
17.5%
4710
 
1.9%
51377
 
3.6%
61163
 
3.0%
7148
 
0.4%
8452
 
1.2%
ValueCountFrequency (%)
8452
 
1.2%
7148
 
0.4%
61163
 
3.0%
51377
 
3.6%
4710
 
1.9%
36692
 
17.5%
25974
 
15.6%
118462
48.3%

cod_agua_canalizada_fam
Categorical

HIGH CORRELATION
MISSING

Distinct2
Distinct (%)< 0.1%
Missing3270
Missing (%)8.5%
Memory size298.9 KiB
1.0
27752 
2.0
7226 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters104934
Distinct characters4
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row1.0
3rd row1.0
4th row1.0
5th row2.0

Common Values

Value      Count  Frequency (%)
1.0        27752  72.6%
2.0        7226   18.9%
(Missing)  3270   8.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
1.0    27752  79.3%
2.0    7226   20.7%
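The Common Values table and the Category Frequency Plot report different percentages for the same counts. A plausible reading, which the arithmetic below supports, is that Common Values divides by all 38248 rows (missing included) while the plot divides by the non-missing rows only. This is an illustrative sketch, not code from the report, and the helper name `share` is hypothetical.

```python
def share(count, denominator):
    """Percentage rounded to one decimal, as displayed in the report."""
    return round(100 * count / denominator, 1)

# Counts copied from the report for cod_agua_canalizada_fam.
counts = {"1.0": 27752, "2.0": 7226}
n_rows = 38248                   # all observations, missing included
n_valid = sum(counts.values())   # non-missing observations

of_all_rows = {k: share(v, n_rows) for k, v in counts.items()}
of_valid_rows = {k: share(v, n_valid) for k, v in counts.items()}

print(of_all_rows)
print(of_valid_rows)
```

The same distinction applies to the other categorical variables in this report that have missing values.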

Most occurring characters

Value  Count  Frequency (%)
.      34978  33.3%
0      34978  33.3%
1      27752  26.4%
2      7226   6.9%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     69956  66.7%
Other Punctuation  34978  33.3%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      34978  50.0%
1      27752  39.7%
2      7226   10.3%

Other Punctuation

Value  Count  Frequency (%)
.      34978  100.0%
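The category names in these tables correspond to Unicode general categories. As a hedged illustration (the profiler's own implementation is not shown in this report), Python's standard `unicodedata` module reproduces the counts above, assuming each value is stored as the three-character string `1.0` or `2.0`. The codes `Nd` and `Po` are the Unicode abbreviations for Decimal Number and Other Punctuation.

```python
import unicodedata
from collections import Counter

# Value counts copied from the report for cod_agua_canalizada_fam.
value_counts = {"1.0": 27752, "2.0": 7226}

char_counts = Counter()
category_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        # Each character of the value occurs once per row holding that value.
        char_counts[ch] += count
        category_counts[unicodedata.category(ch)] += count

print(dict(char_counts))
print(dict(category_counts))
```

This also shows why the numeric codes carry a `.` and a `0` in every row: the column was profiled as text after being stored with a decimal suffix.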

Most occurring scripts

Value   Count   Frequency (%)
Common  104934  100.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
.      34978  33.3%
0      34978  33.3%
1      27752  26.4%
2      7226   6.9%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  104934  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
.      34978  33.3%
0      34978  33.3%
1      27752  26.4%
2      7226   6.9%

cod_abaste_agua_domic_fam
Categorical

Alerts: HIGH CORRELATION (×5), MISSING

Distinct: 4
Distinct (%): < 0.1%
Missing: 3270
Missing (%): 8.5%
Memory size: 298.9 KiB
Top categories: 1.0 (22453), 2.0 (10364), 4.0 (1770), 3.0 (391)

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 104934
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 4.0

Common Values

Value      Count  Frequency (%)
1.0        22453  58.7%
2.0        10364  27.1%
4.0        1770   4.6%
3.0        391    1.0%
(Missing)  3270   8.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
1.0    22453  64.2%
2.0    10364  29.6%
4.0    1770   5.1%
3.0    391    1.1%

Most occurring characters

Value  Count  Frequency (%)
.      34978  33.3%
0      34978  33.3%
1      22453  21.4%
2      10364  9.9%
4      1770   1.7%
3      391    0.4%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     69956  66.7%
Other Punctuation  34978  33.3%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      34978  50.0%
1      22453  32.1%
2      10364  14.8%
4      1770   2.5%
3      391    0.6%

Other Punctuation

Value  Count  Frequency (%)
.      34978  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  104934  100.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
.      34978  33.3%
0      34978  33.3%
1      22453  21.4%
2      10364  9.9%
4      1770   1.7%
3      391    0.4%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  104934  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
.      34978  33.3%
0      34978  33.3%
1      22453  21.4%
2      10364  9.9%
4      1770   1.7%
3      391    0.4%

cod_banheiro_domic_fam
Categorical

Alerts: HIGH CORRELATION (×2), MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 3270
Missing (%): 8.5%
Memory size: 298.9 KiB
Top categories: 1.0 (31079), 2.0 (3899)

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 104934
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value      Count  Frequency (%)
1.0        31079  81.3%
2.0        3899   10.2%
(Missing)  3270   8.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
1.0    31079  88.9%
2.0    3899   11.1%

Most occurring characters

Value  Count  Frequency (%)
.      34978  33.3%
0      34978  33.3%
1      31079  29.6%
2      3899   3.7%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     69956  66.7%
Other Punctuation  34978  33.3%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      34978  50.0%
1      31079  44.4%
2      3899   5.6%

Other Punctuation

Value  Count  Frequency (%)
.      34978  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  104934  100.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
.      34978  33.3%
0      34978  33.3%
1      31079  29.6%
2      3899   3.7%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  104934  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
.      34978  33.3%
0      34978  33.3%
1      31079  29.6%
2      3899   3.7%

cod_escoa_sanitario_domic_fam
Real number (ℝ≥0)

Alerts: MISSING

Distinct: 6
Distinct (%): < 0.1%
Missing: 7169
Missing (%): 18.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.435470897
Minimum: 1
Maximum: 6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 298.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 2
Q3: 3
95-th percentile: 4
Maximum: 6
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.9397914433
Coefficient of variation (CV): 0.385876688
Kurtosis: 1.401994582
Mean: 2.435470897
Median Absolute Deviation (MAD): 1
Skewness: 0.5344110161
Sum: 75692
Variance: 0.8832079569
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)
Most frequent values

Value      Count  Frequency (%)
3          13182  34.5%
2          10352  27.1%
1          5239   13.7%
4          1728   4.5%
6          401    1.0%
5          177    0.5%
(Missing)  7169   18.7%

Smallest values

Value  Count  Frequency (%)
1      5239   13.7%
2      10352  27.1%
3      13182  34.5%
4      1728   4.5%
5      177    0.5%
6      401    1.0%

Largest values

Value  Count  Frequency (%)
6      401    1.0%
5      177    0.5%
4      1728   4.5%
3      13182  34.5%
2      10352  27.1%
1      5239   13.7%

cod_destino_lixo_domic_fam
Real number (ℝ≥0)

Alerts: HIGH CORRELATION (×4), MISSING

Distinct: 6
Distinct (%): < 0.1%
Missing: 3270
Missing (%): 8.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.679455658
Minimum: 1
Maximum: 6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 298.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 3
95-th percentile: 3
Maximum: 6
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 0.9875291931
Coefficient of variation (CV): 0.5880055175
Kurtosis: 0.7379363644
Mean: 1.679455658
Median Absolute Deviation (MAD): 0
Skewness: 1.153628322
Sum: 58744
Variance: 0.9752139073
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)
Most frequent values

Value      Count  Frequency (%)
1          22641  59.2%
3          9550   25.0%
2          2069   5.4%
4          488    1.3%
6          213    0.6%
5          17     < 0.1%
(Missing)  3270   8.5%

Smallest values

Value  Count  Frequency (%)
1      22641  59.2%
2      2069   5.4%
3      9550   25.0%
4      488    1.3%
5      17     < 0.1%
6      213    0.6%

Largest values

Value  Count  Frequency (%)
6      213    0.6%
5      17     < 0.1%
4      488    1.3%
3      9550   25.0%
2      2069   5.4%
1      22641  59.2%

cod_iluminacao_domic_fam
Real number (ℝ≥0)

Alerts: MISSING

Distinct: 6
Distinct (%): < 0.1%
Missing: 3270
Missing (%): 8.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.484676082
Minimum: 1
Maximum: 6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 298.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 4
Maximum: 6
Range: 5
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.141531021
Coefficient of variation (CV): 0.7688754704
Kurtosis: 5.923495847
Mean: 1.484676082
Median Absolute Deviation (MAD): 0
Skewness: 2.541389024
Sum: 51931
Variance: 1.303093072
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)
Most frequent values

Value      Count  Frequency (%)
1          28313  74.0%
3          3316   8.7%
2          1095   2.9%
6          951    2.5%
4          741    1.9%
5          562    1.5%
(Missing)  3270   8.5%

Smallest values

Value  Count  Frequency (%)
1      28313  74.0%
2      1095   2.9%
3      3316   8.7%
4      741    1.9%
5      562    1.5%
6      951    2.5%

Largest values

Value  Count  Frequency (%)
6      951    2.5%
5      562    1.5%
4      741    1.9%
3      3316   8.7%
2      1095   2.9%
1      28313  74.0%

cod_calcamento_domic_fam
Categorical

Alerts: HIGH CORRELATION (×2), MISSING

Distinct: 3
Distinct (%): < 0.1%
Missing: 3270
Missing (%): 8.5%
Memory size: 298.9 KiB
Top categories: 3.0 (18835), 1.0 (12937), 2.0 (3206)

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 104934
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 3.0
4th row: 3.0
5th row: 3.0

Common Values

Value      Count  Frequency (%)
3.0        18835  49.2%
1.0        12937  33.8%
2.0        3206   8.4%
(Missing)  3270   8.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
3.0    18835  53.8%
1.0    12937  37.0%
2.0    3206   9.2%

Most occurring characters

Value  Count  Frequency (%)
.      34978  33.3%
0      34978  33.3%
3      18835  17.9%
1      12937  12.3%
2      3206   3.1%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     69956  66.7%
Other Punctuation  34978  33.3%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      34978  50.0%
3      18835  26.9%
1      12937  18.5%
2      3206   4.6%

Other Punctuation

Value  Count  Frequency (%)
.      34978  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  104934  100.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
.      34978  33.3%
0      34978  33.3%
3      18835  17.9%
1      12937  12.3%
2      3206   3.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  104934  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
.      34978  33.3%
0      34978  33.3%
3      18835  17.9%
1      12937  12.3%
2      3206   3.1%

cod_familia_indigena_fam
Categorical

Alerts: HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 298.9 KiB
Top categories: 2 (36256), 1 (1992)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 38248
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value  Count  Frequency (%)
2      36256  94.8%
1      1992   5.2%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
2      36256  94.8%
1      1992   5.2%

Most occurring characters

Value  Count  Frequency (%)
2      36256  94.8%
1      1992   5.2%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  38248  100.0%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
2      36256  94.8%
1      1992   5.2%

Most occurring scripts

Value   Count  Frequency (%)
Common  38248  100.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
2      36256  94.8%
1      1992   5.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  38248  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
2      36256  94.8%
1      1992   5.2%

ind_familia_quilombola_fam
Categorical

Alerts: HIGH CORRELATION, MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 1993
Missing (%): 5.2%
Memory size: 298.9 KiB
Top categories: 2.0 (35980), 1.0 (275)

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 108765
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2.0
2nd row: 2.0
3rd row: 2.0
4th row: 2.0
5th row: 2.0

Common Values

Value      Count  Frequency (%)
2.0        35980  94.1%
1.0        275    0.7%
(Missing)  1993   5.2%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
2.0    35980  99.2%
1.0    275    0.8%

Most occurring characters

Value  Count  Frequency (%)
.      36255  33.3%
0      36255  33.3%
2      35980  33.1%
1      275    0.3%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     72510  66.7%
Other Punctuation  36255  33.3%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      36255  50.0%
2      35980  49.6%
1      275    0.4%

Other Punctuation

Value  Count  Frequency (%)
.      36255  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  108765  100.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
.      36255  33.3%
0      36255  33.3%
2      35980  33.1%
1      275    0.3%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  108765  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
.      36255  33.3%
0      36255  33.3%
2      35980  33.1%
1      275    0.3%

nom_estab_assist_saude_fam
Categorical

Alerts: HIGH CARDINALITY, MISSING

Distinct: 1345
Distinct (%): 8.8%
Missing: 22940
Missing (%): 60.0%
Memory size: 298.9 KiB

Top categories

Value                                         Count
CENTRO DE SAUDE I MUCAJAI                     566
HOSP DE ALTOS INST DE SAUDE JOSE GIL BARBOSA  411
CENTRO DE SAUDE CLAITON O DA SILVA            267
PM TARTA UBS JOSE ALVES MEIRELES              249
POSTO DE SAUDE NELSON DIAS FERNANDES          197
Other values (1340)                           13618

Length

Max length: 60
Median length: 48
Mean length: 30.57656128
Min length: 5

Characters and Unicode

Total characters: 468066
Distinct characters: 37
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 296
Unique (%): 1.9%

Sample

1st row: US CAMPO VERDE
2nd row: UNIDADE REGIONAL DE SAUDE SERRA
3rd row: UNIDADE BASICA DE SAUDE VILA NOVA DE COLARES
4th row: UNIDADE DE SAUDE DA FAMILIA DE ULISSES GUIMARAES
5th row: UNIDADE DE SAUDE DA FAMILIA DE TERRA VERMELHA

Common Values

Value                                         Count  Frequency (%)
CENTRO DE SAUDE I MUCAJAI                     566    1.5%
HOSP DE ALTOS INST DE SAUDE JOSE GIL BARBOSA  411    1.1%
CENTRO DE SAUDE CLAITON O DA SILVA            267    0.7%
PM TARTA UBS JOSE ALVES MEIRELES              249    0.7%
POSTO DE SAUDE NELSON DIAS FERNANDES          197    0.5%
CENTRO DE SAUDE IRACEMA GALVAO                155    0.4%
CENTRO DE SAUDE SEBASTIAO RODRIGUES SILVA     153    0.4%
UNIDADE MISTA IRMA CAMILA                     144    0.4%
U B S LAGOA                                   132    0.3%
CENTRO DE SAUDE CRISTINO JOSE DA SILVA        131    0.3%
Other values (1335)                           12903  33.7%
(Missing)                                     22940  60.0%

Length

Histogram of lengths of the category

Words

Value                Count  Frequency (%)
de                   11475  13.4%
saude                8802   10.3%
unidade              4106   4.8%
centro               3331   3.9%
da                   3258   3.8%
basica               2219   2.6%
usf                  2042   2.4%
posto                1603   1.9%
familia              1555   1.8%
jose                 1402   1.6%
Other values (1390)  45734  53.5%

Most occurring characters

Value              Count  Frequency (%)
(space)            70219  15.0%
A                  60448  12.9%
E                  45724  9.8%
D                  40142  8.6%
S                  35117  7.5%
I                  31720  6.8%
O                  29097  6.2%
U                  23225  5.0%
N                  20279  4.3%
R                  19492  4.2%
Other values (27)  92603  19.8%

Most occurring categories

Value             Count   Frequency (%)
Uppercase Letter  397573  84.9%
Space Separator   70219   15.0%
Decimal Number    274     0.1%

Most frequent character per category

Uppercase Letter

Value              Count  Frequency (%)
A                  60448  15.2%
E                  45724  11.5%
D                  40142  10.1%
S                  35117  8.8%
I                  31720  8.0%
O                  29097  7.3%
U                  23225  5.8%
N                  20279  5.1%
R                  19492  4.9%
T                  14427  3.6%
Other values (16)  77902  19.6%

Decimal Number

Value  Count  Frequency (%)
1      55     20.1%
2      49     17.9%
0      45     16.4%
3      36     13.1%
5      31     11.3%
4      22     8.0%
8      13     4.7%
6      12     4.4%
9      8      2.9%
7      3      1.1%

Space Separator

Value    Count  Frequency (%)
(space)  70219  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Latin   397573  84.9%
Common  70493   15.1%

Most frequent character per script

Latin

Value              Count  Frequency (%)
A                  60448  15.2%
E                  45724  11.5%
D                  40142  10.1%
S                  35117  8.8%
I                  31720  8.0%
O                  29097  7.3%
U                  23225  5.8%
N                  20279  5.1%
R                  19492  4.9%
T                  14427  3.6%
Other values (16)  77902  19.6%

Common

Value    Count  Frequency (%)
(space)  70219  99.6%
1        55     0.1%
2        49     0.1%
0        45     0.1%
3        36     0.1%
5        31     < 0.1%
4        22     < 0.1%
8        13     < 0.1%
6        12     < 0.1%
9        8      < 0.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  468066  100.0%

Most frequent character per block

ASCII

Value              Count  Frequency (%)
(space)            70219  15.0%
A                  60448  12.9%
E                  45724  9.8%
D                  40142  8.6%
S                  35117  7.5%
I                  31720  6.8%
O                  29097  6.2%
U                  23225  5.0%
N                  20279  4.3%
R                  19492  4.2%
Other values (27)  92603  19.8%

cod_eas_fam
Real number (ℝ≥0)

Alerts: MISSING

Distinct: 1354
Distinct (%): 8.8%
Missing: 22940
Missing (%): 60.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3242215.1
Minimum: 43
Maximum: 9518517
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 298.9 KiB

Quantile statistics

Minimum: 43
5-th percentile: 2002906
Q1: 2320029
Median: 2458659
Q3: 3576620
95-th percentile: 6964842
Maximum: 9518517
Range: 9518474
Interquartile range (IQR): 1256591
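
The range and interquartile range listed above follow directly from the other quantiles. A minimal check, using only values copied from this report (the variable names are illustrative):

```python
# Quantiles copied from the report for cod_eas_fam.
minimum, maximum = 43, 9518517
q1, q3 = 2320029, 3576620

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1

print(value_range, iqr)
```

Because cod_eas_fam looks like an establishment code rather than a measurement, these spreads describe how the codes are distributed, not a physical quantity.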

Descriptive statistics

Standard deviation: 1677294.314
Coefficient of variation (CV): 0.517329746
Kurtosis: 0.7959138076
Mean: 3242215.1
Median Absolute Deviation (MAD): 438192
Skewness: 1.394578715
Sum: 4.963182876 × 10^10
Variance: 2.813316217 × 10^12
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Most frequent values

Value                Count  Frequency (%)
2320894              566    1.5%
2324075              411    1.1%
2589834              267    0.7%
2020467              249    0.7%
5412161              197    0.5%
2320401              155    0.4%
2320207              153    0.4%
2320762              144    0.4%
2367068              132    0.3%
2320053              131    0.3%
Other values (1344)  12903  33.7%
(Missing)            22940  60.0%

Smallest values

Value  Count  Frequency (%)
43     6      < 0.1%
51     1      < 0.1%
116    1      < 0.1%
140    1      < 0.1%
159    1      < 0.1%
175    2      < 0.1%
248    2      < 0.1%
272    2      < 0.1%
280    2      < 0.1%
302    1      < 0.1%

Largest values

Value    Count  Frequency (%)
9518517  2      < 0.1%
9356606  1      < 0.1%
9356568  7      < 0.1%
9356479  10     < 0.1%
9345736  1      < 0.1%
9345698  1      < 0.1%
9345671  1      < 0.1%
9345663  1      < 0.1%
9342192  1      < 0.1%
9308903  5      < 0.1%

nom_centro_assist_fam
Categorical

Alerts: HIGH CARDINALITY, MISSING

Distinct: 222
Distinct (%): 2.1%
Missing: 27904
Missing (%): 73.0%
Memory size: 298.9 KiB

Top categories

Value                                                            Count
CRAS CENTRO DE REFERENCIA DE ASSISTENCIA SOCIAL                  1087
CRAS CENTRO DE REFERENCIA DA ASSISTENCIA SOCIAL                  825
CRAS CENTRO DE REFERENCIA DE ASSISTENCIA SOCIAL CASA DA FAMILIA  589
CENTRO DE REFERENCIA DE ASSISTENCIA SOCIAL                       511
CRAS NOSSA SENHORA DE FATIMA                                     489
Other values (217)                                               6843

Length

Max length: 70
Median length: 62
Mean length: 32.40970611
Min length: 4

Characters and Unicode

Total characters: 335246
Distinct characters: 35
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 26
Unique (%): 0.3%

Sample

1st row: CRAS DE SERRA SEDE
2nd row: CRAS VIANA
3rd row: CRAS IV ALTO MUCURI
4th row: CRAS III CAMPO VERDE
5th row: CRAS DE VILA NOVA DE COLARES

Common Values

Value                                                            Count  Frequency (%)
CRAS CENTRO DE REFERENCIA DE ASSISTENCIA SOCIAL                  1087   2.8%
CRAS CENTRO DE REFERENCIA DA ASSISTENCIA SOCIAL                  825    2.2%
CRAS CENTRO DE REFERENCIA DE ASSISTENCIA SOCIAL CASA DA FAMILIA  589    1.5%
CENTRO DE REFERENCIA DE ASSISTENCIA SOCIAL                       511    1.3%
CRAS NOSSA SENHORA DE FATIMA                                     489    1.3%
CRAS SAO JOSE                                                    485    1.3%
CENTRO DE REFERENCIA DA ASSISTENCIA SOCIAL CASA DAS FAMILIAS     403    1.1%
CAROEBE                                                          334    0.9%
CRAS VOVO JULIETA                                                302    0.8%
CRAS EDSON DA MOTA CORREA                                        237    0.6%
Other values (212)                                               5082   13.3%
(Missing)                                                        27904  73.0%

Length

Histogram of lengths of the category

Words

Value               Count  Frequency (%)
cras                8778   16.3%
de                  7977   14.8%
centro              4240   7.9%
referencia          4211   7.8%
social              4211   7.8%
assistencia         4103   7.6%
da                  2718   5.0%
casa                1047   1.9%
sao                 701    1.3%
familia             642    1.2%
Other values (331)  15287  28.4%

Most occurring characters

Value              Count  Frequency (%)
A                  46694  13.9%
(space)            43571  13.0%
E                  37652  11.2%
S                  33340  9.9%
C                  29870  8.9%
R                  28358  8.5%
I                  25423  7.6%
O                  17734  5.3%
N                  16917  5.0%
D                  13474  4.0%
Other values (25)  42213  12.6%

Most occurring categories

Value             Count   Frequency (%)
Uppercase Letter  291523  87.0%
Space Separator   43571   13.0%
Decimal Number    152     < 0.1%

Most frequent character per category

Uppercase Letter

Value              Count  Frequency (%)
A                  46694  16.0%
E                  37652  12.9%
S                  33340  11.4%
C                  29870  10.2%
R                  28358  9.7%
I                  25423  8.7%
O                  17734  6.1%
N                  16917  5.8%
D                  13474  4.6%
T                  11954  4.1%
Other values (14)  30107  10.3%

Decimal Number

Value  Count  Frequency (%)
0      76     50.0%
4      19     12.5%
1      13     8.6%
2      10     6.6%
9      8      5.3%
5      7      4.6%
6      6      3.9%
7      5      3.3%
3      5      3.3%
8      3      2.0%

Space Separator

Value    Count  Frequency (%)
(space)  43571  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Latin   291523  87.0%
Common  43723   13.0%

Most frequent character per script

Latin

Value              Count  Frequency (%)
A                  46694  16.0%
E                  37652  12.9%
S                  33340  11.4%
C                  29870  10.2%
R                  28358  9.7%
I                  25423  8.7%
O                  17734  6.1%
N                  16917  5.8%
D                  13474  4.6%
T                  11954  4.1%
Other values (14)  30107  10.3%

Common

Value    Count  Frequency (%)
(space)  43571  99.7%
0        76     0.2%
4        19     < 0.1%
1        13     < 0.1%
2        10     < 0.1%
9        8      < 0.1%
5        7      < 0.1%
6        6      < 0.1%
7        5      < 0.1%
3        5      < 0.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  335246  100.0%

Most frequent character per block

ASCII

Value              Count  Frequency (%)
A                  46694  13.9%
(space)            43571  13.0%
E                  37652  11.2%
S                  33340  9.9%
C                  29870  8.9%
R                  28358  8.5%
I                  25423  7.6%
O                  17734  5.3%
N                  16917  5.0%
D                  13474  4.0%
Other values (25)  42213  12.6%

cod_centro_assist_fam
Real number (ℝ≥0)

Alerts: HIGH CORRELATION (×4), MISSING

Distinct: 247
Distinct (%): 2.4%
Missing: 27904
Missing (%): 73.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.994222237 × 10^10
Minimum: 1.120080282 × 10^10
Maximum: 5.108402069 × 10^10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 298.9 KiB

Quantile statistics

Minimum: 1.120080282 × 10^10
5-th percentile: 1.301222953 × 10^10
Q1: 1.400150105 × 10^10
Median: 1.502400647 × 10^10
Q3: 2.30765358 × 10^10
95-th percentile: 5.108400112 × 10^10
Maximum: 5.108402069 × 10^10
Range: 3.988321787 × 10^10
Interquartile range (IQR): 9075034755

Descriptive statistics

Standard deviation: 9295916322
Coefficient of variation (CV): 0.4661424465
Kurtosis: 4.300190288
Mean: 1.994222237 × 10^10
Median Absolute Deviation (MAD): 2005003718
Skewness: 2.049838742
Sum: 2.062823482 × 10^14
Variance: 8.641406027 × 10^19
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Most frequent values

Value                 Count  Frequency (%)
1.400300625 × 10^10   694    1.8%
1.400170104 × 10^10   582    1.5%
1.400150105 × 10^10   511    1.3%
1.600503232 × 10^10   489    1.3%
1.302500225 × 10^10   485    1.3%
2.200400218 × 10^10   403    1.1%
1.400230226 × 10^10   334    0.9%
1.400021491 × 10^10   302    0.8%
1.301900275 × 10^10   285    0.7%
2.30370012 × 10^10    237    0.6%
Other values (237)    6022   15.7%
(Missing)             27904  73.0%

Smallest values

Value                 Count  Frequency (%)
1.120080282 × 10^10   45     0.1%
1.140045147 × 10^10   2      < 0.1%
1.150240111 × 10^10   3      < 0.1%
1.160050106 × 10^10   37     0.1%
1.20005047 × 10^10    11     < 0.1%
1.200170217 × 10^10   6      < 0.1%
1.20025016 × 10^10    6      < 0.1%
1.200340167 × 10^10   9      < 0.1%
1.200393181 × 10^10   33     0.1%
1.200420167 × 10^10   4      < 0.1%

Largest values

Value                 Count  Frequency (%)
5.108402069 × 10^10   157    0.4%
5.108402045 × 10^10   135    0.4%
5.108400112 × 10^10   119    0.3%
5.108400112 × 10^10   169    0.4%
3.205209629 × 10^10   1      < 0.1%
3.20520381 × 10^10    15     < 0.1%
3.205201533 × 10^10   14     < 0.1%
3.20520013 × 10^10    14     < 0.1%
3.20520013 × 10^10    10     < 0.1%
3.205200129 × 10^10   20     0.1%
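
In the table of largest values, 5.108400112 × 10^10 and 3.20520013 × 10^10 each appear twice with different counts. One plausible explanation is that the underlying values are 11-digit codes that differ only beyond the ten significant digits the report displays. The sketch below illustrates the effect with two made-up codes. They are hypothetical and not taken from the dataset, and the helper name `as_displayed` is an assumption for illustration.

```python
def as_displayed(value):
    """Format a number with ten significant digits in scientific notation."""
    return f"{value:.9e}"

# Hypothetical 11-digit codes that differ only in the last digit.
code_a = 51084001121
code_b = 51084001123

print(as_displayed(code_a), as_displayed(code_b))
print(code_a == code_b, as_displayed(code_a) == as_displayed(code_b))
```

If that explanation holds, treating the column as an identifier (text or integer) rather than as a real number would keep the codes distinguishable.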

ind_parc_mds_fam
Real number (ℝ≥0)

Alerts: MISSING, ZEROS

Distinct: 13
Distinct (%): < 0.1%
Missing: 1009
Missing (%): 2.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.5494777
Minimum: 0
Maximum: 306
Zeros: 33420
Zeros (%): 87.4%
Negative: 0
Negative (%): 0.0%
Memory size: 298.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 205
Maximum: 306
Range: 306
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 67.72414781
Coefficient of variation (CV): 3.003357715
Kurtosis: 6.286563866
Mean: 22.5494777
Median Absolute Deviation (MAD): 0
Skewness: 2.792475051
Sum: 839720
Variance: 4586.560196
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=13)
Most frequent values

Value             Count  Frequency (%)
0                 33420  87.4%
205               2081   5.4%
202               609    1.6%
204               416    1.1%
301               402    1.1%
306               113    0.3%
201               84     0.2%
303               59     0.2%
304               20     0.1%
305               15     < 0.1%
Other values (3)  20     0.1%
(Missing)         1009   2.6%

Smallest values

Value  Count  Frequency (%)
0      33420  87.4%
101    7      < 0.1%
201    84     0.2%
202    609    1.6%
203    4      < 0.1%
204    416    1.1%
205    2081   5.4%
301    402    1.1%
302    9      < 0.1%
303    59     0.2%

Largest values

Value  Count  Frequency (%)
306    113    0.3%
305    15     < 0.1%
304    20     0.1%
303    59     0.2%
302    9      < 0.1%
301    402    1.1%
205    2081   5.4%
204    416    1.1%
203    4      < 0.1%
202    609    1.6%

marc_pbf
Categorical

Alerts: HIGH CORRELATION (×4)

Distinct: 2
Distinct (%): < 0.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 298.9 KiB
Top categories: 1.0 (23737), 0.0 (14510)

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 114741
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 1.0
3rd row: 0.0
4th row: 1.0
5th row: 1.0

Common Values

Value      Count  Frequency (%)
1.0        23737  62.1%
0.0        14510  37.9%
(Missing)  1      < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
1.0    23737  62.1%
0.0    14510  37.9%

Most occurring characters

Value  Count  Frequency (%)
0      52757  46.0%
.      38247  33.3%
1      23737  20.7%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     76494  66.7%
Other Punctuation  38247  33.3%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      52757  69.0%
1      23737  31.0%

Other Punctuation

Value  Count  Frequency (%)
.      38247  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  114741  100.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
0      52757  46.0%
.      38247  33.3%
1      23737  20.7%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  114741  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
0      52757  46.0%
.      38247  33.3%
1      23737  20.7%

qtde_pessoas
Real number (ℝ≥0)

Distinct: 14
Distinct (%): < 0.1%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.921562475
Minimum: 1
Maximum: 14
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 298.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 3
Q3: 4
95-th percentile: 6
Maximum: 14
Range: 13
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.663634561
Coefficient of variation (CV): 0.5694331628
Kurtosis: 1.644767687
Mean: 2.921562475
Median Absolute Deviation (MAD): 1
Skewness: 1.079867492
Sum: 111741
Variance: 2.767679952
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=14)
Most frequent values

Value             Count  Frequency (%)
2                 9276   24.3%
3                 8771   22.9%
1                 8331   21.8%
4                 5964   15.6%
5                 3064   8.0%
6                 1542   4.0%
7                 692    1.8%
8                 361    0.9%
9                 145    0.4%
10                59     0.2%
Other values (4)  42     0.1%

Smallest values

Value  Count  Frequency (%)
1      8331   21.8%
2      9276   24.3%
3      8771   22.9%
4      5964   15.6%
5      3064   8.0%
6      1542   4.0%
7      692    1.8%
8      361    0.9%
9      145    0.4%
10     59     0.2%

Largest values

Value  Count  Frequency (%)
14     1      < 0.1%
13     6      < 0.1%
12     13     < 0.1%
11     22     0.1%
10     59     0.2%
9      145    0.4%
8      361    0.9%
7      692    1.8%
6      1542   4.0%
5      3064   8.0%

peso.fam
Real number (ℝ≥0)

Alerts: HIGH CORRELATION (×2)

Distinct: 101
Distinct (%): 0.3%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.0534024 × 10^14
Minimum: 5.50304564 × 10^12
Maximum: 5.504777045 × 10^14
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 298.9 KiB

Quantile statistics

Minimum: 5.50304564 × 10^12
5-th percentile: 5.502224423 × 10^13
Q1: 5.502030765 × 10^14
Median: 5.502451463 × 10^14
Q3: 5.502556682 × 10^14
95-th percentile: 5.503007328 × 10^14
Maximum: 5.504777045 × 10^14
Range: 5.449746588 × 10^14
Interquartile range (IQR): 5.259172145 × 10^10

Descriptive statistics

Standard deviation: 1.441889458 × 10^14
Coefficient of variation (CV): 0.2853304257
Kurtosis: 6.479926469
Mean: 5.0534024 × 10^14
Median Absolute Deviation (MAD): 1.933114247 × 10^10
Skewness: -2.907525222
Sum: 1.932774816 × 10^19
Variance: 2.07904521 × 10^28
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Most frequent values

Value                 Count  Frequency (%)
5.502451463 × 10^14   11305  29.6%
5.502258152 × 10^14   3726   9.7%
5.502224423 × 10^13   1926   5.0%
5.502705077 × 10^14   1690   4.4%
5.502705528 × 10^14   1044   2.7%
5.501863228 × 10^14   933    2.4%
5.502618074 × 10^14   914    2.4%
5.502148111 × 10^14   908    2.4%
5.502556682 × 10^14   875    2.3%
5.500689349 × 10^14   756    2.0%
Other values (91)     14170  37.0%

Smallest values

Value                 Count  Frequency (%)
5.50304564 × 10^12    508    1.3%
5.503181124 × 10^12   305    0.8%
5.500494516 × 10^13   100    0.3%
5.50126943 × 10^13    152    0.4%
5.501544022 × 10^13   127    0.3%
5.501577337 × 10^13   131    0.3%
5.502224423 × 10^13   1926   5.0%
5.502316127 × 10^13   46     0.1%
5.502363543 × 10^13   91     0.2%
5.499916609 × 10^14   10     < 0.1%

Largest values

Value                 Count  Frequency (%)
5.504777045 × 10^14   6      < 0.1%
5.504410327 × 10^14   7      < 0.1%
5.50377509 × 10^14    153    0.4%
5.503772467 × 10^14   183    0.5%
5.503606198 × 10^14   183    0.5%
5.503557046 × 10^14   27     0.1%
5.503496942 × 10^14   17     < 0.1%
5.503459528 × 10^14   28     0.1%
5.503448472 × 10^14   84     0.2%
5.503426597 × 10^14   22     0.1%

Interactions

[Interaction plots omitted: 225 pairwise plots, rendered with Matplotlib v3.7.1]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
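The calculation described above can be sketched in plain Python. This is an illustrative sketch rather than the report generator's own code: the helper names (`average_ranks`, `pearson`, `spearman`) and the sample data are assumptions, and ties are handled by average ranks, which is one common convention.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman(x, y)   # ranks agree perfectly
r = pearson(x, y)      # linear fit is imperfect
```

On this toy data ρ reaches 1 while Pearson's r stays below 1, which is the difference between monotonic and linear correlation. With pandas, `DataFrame.corr(method="spearman")` computes the same kind of matrix for all numeric columns.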

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
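As a minimal sketch of this formula and of the invariance property, the following plain-Python snippet computes r directly and then recomputes it after shifting and rescaling both variables. The function name `pearson_r` and the data are illustrative assumptions.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's r: covariance of x and y over the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

# Hypothetical, nearly linear data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(x, y)
# Separate changes of location and positive scale leave r unchanged.
r_shifted = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
# A negative scale factor flips the sign of r but not its magnitude.
r_flipped = pearson_r(x, [-b for b in y])
```

The population (divide by n) form is used for both the covariance and the standard deviations; using n - 1 throughout gives the same r, because the factors cancel.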

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
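The pair-counting definition above corresponds to the variant usually called τ-a. A brute-force sketch follows; the function name `kendall_tau_a` and the data are assumptions for illustration. Note that libraries such as SciPy default to the tie-adjusted τ-b variant, so results can differ when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables order the pair the same way
        elif sign < 0:
            discordant += 1      # the variables order the pair oppositely
        # sign == 0 is a tie in x or y: counted as neither
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Hypothetical data: one swapped pair out of six.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
```

Here five pairs are concordant and one is discordant, so τ = (5 - 1) / 6. The loop is quadratic in the number of observations, which is fine for a sketch but slow for a dataset of this size.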

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
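A minimal sketch of a bias-corrected Cramér's V, computed from a contingency table of counts, might look as follows. It follows the commonly cited form of Bergsma's correction; the function name `cramers_v_corrected` and the example tables are assumptions, and the report generator's actual implementation may differ in detail.

```python
from math import sqrt

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows of counts).

    Assumes every row and column total is positive.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected bias of phi2 and shrink the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Hypothetical tables: perfect association and exact independence.
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
v_independent = cramers_v_corrected([[25, 25], [25, 25]])
```

The `max(0.0, ...)` clamp keeps the corrected value non-negative when the uncorrected statistic is smaller than its expected bias, as in the independent table.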

Phik (φk)

Phik (φk) is a practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. See the Phik documentation for details.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
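Nullity correlation can be read as an ordinary correlation between "is missing" indicators. The sketch below makes that concrete under stated assumptions: missing values are represented by `None`, the function name `nullity_correlation` is hypothetical, and the three toy columns are invented for illustration rather than taken from the dataset.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'is missing' indicators of two columns.

    Returns NaN when either column is fully present or fully missing,
    because its indicator then has no variance.
    """
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / sqrt(var_a * var_b)

# Toy columns (hypothetical): a name and its code go missing together,
# while a third column tends to be missing where the others are present.
centre_name = ["A", "B", None, "C", None, "D"]
centre_code = [1, 2, None, 3, None, 4]
health_unit = [None, "U1", "U2", None, "U3", "U4"]

together = nullity_correlation(centre_name, centre_code)
opposed = nullity_correlation(centre_name, health_unit)
```

A value of 1 means the two columns are always missing together, a negative value means one tends to be present when the other is absent, and values near 0 mean the missingness patterns are unrelated.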

Sample

First rows

(index) | cd_ibge | estrato | classf | id_familia | dat_cadastramento_fam | dat_alteracao_fam | vlr_renda_media_fam | dat_atualizacao_familia | cod_local_domic_fam | cod_especie_domic_fam | qtd_comodos_domic_fam | qtd_comodos_dormitorio_fam | cod_material_piso_fam | cod_material_domic_fam | cod_agua_canalizada_fam | cod_abaste_agua_domic_fam | cod_banheiro_domic_fam | cod_escoa_sanitario_domic_fam | cod_destino_lixo_domic_fam | cod_iluminacao_domic_fam | cod_calcamento_domic_fam | cod_familia_indigena_fam | ind_familia_quilombola_fam | nom_estab_assist_saude_fam | cod_eas_fam | nom_centro_assist_fam | cod_centro_assist_fam | ind_parc_mds_fam | marc_pbf | qtde_pessoas | peso.fam
032050022212018-06-282018-10-02244.02018-06-281.01.05.02.05.01.01.01.01.01.02.01.01.022.0NaNNaNCRAS DE SERRA SEDE3.205003e+100.00.05.05.502565e+14
132051012232018-08-272018-11-2960.02018-11-291.01.05.02.05.01.01.01.01.01.01.01.01.022.0NaNNaNCRAS VIANA3.205103e+100.01.05.05.503557e+14
232013082242018-02-232018-02-27937.02018-02-231.01.04.01.02.02.01.01.01.01.01.01.03.022.0NaNNaNCRAS IV ALTO MUCURI3.201300e+100.00.01.05.502597e+14
332013082262013-12-272018-10-0144.02017-06-221.01.04.01.02.02.01.01.01.01.01.02.03.022.0US CAMPO VERDE2652994.0CRAS III CAMPO VERDE3.201300e+100.01.02.05.502597e+14
432050022272018-03-262018-03-280.02018-03-261.01.04.01.05.01.02.04.01.05.03.01.03.022.0UNIDADE REGIONAL DE SAUDE SERRA2465795.0NaNNaN0.01.02.05.502565e+14
532050022282016-10-272018-10-01176.02016-10-271.01.06.03.05.01.01.01.01.01.01.01.02.022.0UNIDADE BASICA DE SAUDE VILA NOVA DE COLARES2522845.0CRAS DE VILA NOVA DE COLARES3.205000e+100.01.05.05.502565e+14
632052002292015-06-162018-10-01312.02018-03-201.01.05.02.05.01.01.01.01.03.01.01.03.022.0UNIDADE DE SAUDE DA FAMILIA DE ULISSES GUIMARAES3346501.0CRAS JABAETE3.205202e+100.00.03.05.502451e+14
7320130822102017-04-052018-10-01954.02018-07-041.01.01.01.05.01.01.01.01.01.01.02.01.022.0NaNNaNCRAS VII SOTELANDIA3.201304e+100.00.01.05.502597e+14
8320520022112018-10-032018-10-15477.02018-10-031.01.05.02.05.01.01.01.01.01.01.01.01.022.0UNIDADE DE SAUDE DA FAMILIA DE TERRA VERMELHA2403412.0CRAS MORADA DA BARRA3.205200e+100.00.02.05.502451e+14
9320500222122016-05-112016-05-114.02016-05-111.01.01.01.05.01.01.01.01.01.01.01.01.022.0NaNNaNNaNNaN0.01.03.05.502565e+14

Last rows

(index) | cd_ibge | estrato | classf | id_familia | dat_cadastramento_fam | dat_alteracao_fam | vlr_renda_media_fam | dat_atualizacao_familia | cod_local_domic_fam | cod_especie_domic_fam | qtd_comodos_domic_fam | qtd_comodos_dormitorio_fam | cod_material_piso_fam | cod_material_domic_fam | cod_agua_canalizada_fam | cod_abaste_agua_domic_fam | cod_banheiro_domic_fam | cod_escoa_sanitario_domic_fam | cod_destino_lixo_domic_fam | cod_iluminacao_domic_fam | cod_calcamento_domic_fam | cod_familia_indigena_fam | ind_familia_quilombola_fam | nom_estab_assist_saude_fam | cod_eas_fam | nom_centro_assist_fam | cod_centro_assist_fam | ind_parc_mds_fam | marc_pbf | qtde_pessoas | peso.fam
38238120034413457662016-02-292018-10-0110.02018-02-221.02.0NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN22.0UNIDADE DE SAUDE JOSEFA NUNES2000962.0NaNNaN0.01.05.05.502451e+14
38239120045023457672011-12-052018-09-260.02018-05-111.01.05.02.05.01.01.02.01.02.01.01.03.022.0NaNNaNNaNNaN0.00.01.05.501544e+13
38240120060923457682002-06-062018-09-30133.02017-08-111.01.02.01.04.03.01.02.01.03.01.01.01.022.0UNIDADE DE SAUDE DA FAMILIA 24 DE ABRIL3901009.0NaNNaN0.00.02.05.501311e+14
38241120033613457702002-05-252018-10-1665.02017-07-102.01.06.03.02.03.01.01.01.03.03.01.01.022.0UNIDADE S FAMILIA DR CERQUEIRA2002272.0NaNNaN0.01.04.05.502451e+14
38242120033613457712007-12-142018-09-3066.02017-04-261.01.03.02.04.03.02.02.01.03.03.01.03.022.0U S F QUINTINO RIO BRANCO LEBRE2000105.0NaNNaN0.01.03.05.502451e+14
38243120050023457722006-12-082018-09-306.02017-11-062.01.04.02.04.03.02.02.02.0NaN3.06.03.022.0ESF MODULO I5981891.0NaNNaN0.01.06.05.503606e+14
38244120020323457732003-03-132018-09-300.02017-10-022.01.01.01.03.06.02.02.01.03.01.02.02.022.0NaNNaNNaNNaN0.01.05.05.503181e+12
38245120080713457752007-12-172016-05-1762.02016-05-171.01.05.02.04.03.02.02.01.03.01.01.03.022.0UNIDADE DE SAUDE DA FAMILIA OSWALDO CRUZ2001101.0CRAS CENTRO DE REFERENCIA DE ASSISTENCIA SOCIAL1.200801e+10NaN1.07.05.502451e+14
38246120020323457762013-10-222018-10-01275.02017-09-061.01.04.02.04.03.01.01.01.03.02.01.01.022.0NaNNaNNaNNaN0.00.03.05.503181e+12
38247120080713457772018-04-092018-10-01593.02018-04-102.01.03.01.03.06.02.02.02.0NaN3.01.03.02NaNNaNNaNNaNNaNNaNNaNNaNNaN